Rapid speaker adaptation using regression-tree based spectral peak alignment

Authors

  • Shizhen Wang
  • Xiaodong Cui
  • Abeer Alwan
Abstract

In this paper, regression-tree based spectral peak alignment is proposed for rapid speaker adaptation using the linearization of VTLN. Two different regression classes are investigated: phonetic classes (using combined knowledge and data-driven techniques) and mixture classes. Compared to MLLR and VTLN, improved performance is obtained for both supervised and unsupervised adaptation on both medium-vocabulary and connected-digit recognition tasks. To further improve performance, MLLR is integrated into the regression-tree based peak alignment. Experimental results show that performance improvements can be achieved even with limited adaptation data.
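The abstract does not give the estimation procedure, so the following is only an illustrative sketch of the general idea of class-dependent peak alignment, not the authors' algorithm. It assumes a purely linear warp `f_ref ≈ alpha * f_spk` per regression class, fitted by least squares on paired spectral peak frequencies, with a back-off to a single global factor when a class has too few peaks. The function name, the class labels, the `min_count` threshold, and the input format are all hypothetical.

```python
from collections import defaultdict


def estimate_class_warps(peaks, min_count=2):
    """Fit one linear warping factor per regression class (hypothetical helper).

    peaks: iterable of (class_label, speaker_peak_hz, reference_peak_hz).
    Returns (warps, global_alpha), where warps maps each class label to the
    least-squares alpha minimizing sum((ref - alpha * spk) ** 2).
    Classes with fewer than min_count peaks fall back to the global alpha
    (an assumed back-off rule, mimicking a two-level regression tree).
    """
    num = defaultdict(float)   # per-class sum of spk * ref
    den = defaultdict(float)   # per-class sum of spk ** 2
    count = defaultdict(int)   # per-class number of peak pairs

    for label, spk_hz, ref_hz in peaks:
        num[label] += spk_hz * ref_hz
        den[label] += spk_hz * spk_hz
        count[label] += 1

    # Root-node estimate pooled over all classes; identity warp if no data.
    total_den = sum(den.values())
    global_alpha = sum(num.values()) / total_den if total_den > 0 else 1.0

    warps = {}
    for label in count:
        if count[label] >= min_count and den[label] > 0:
            warps[label] = num[label] / den[label]
        else:
            warps[label] = global_alpha
    return warps, global_alpha


if __name__ == "__main__":
    # Made-up peak pairs, for illustration only.
    example_peaks = [
        ("vowel", 500.0, 550.0),
        ("vowel", 1500.0, 1650.0),
        ("fricative", 4000.0, 4000.0),
    ]
    warps, global_alpha = estimate_class_warps(example_peaks)
    print(warps, global_alpha)
```

In this toy setup the "vowel" class gets its own factor because it has two peak pairs, while "fricative" has only one and inherits the pooled estimate. A real system would operate on model means or features rather than raw peak lists.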


Similar resources

Rapid Speaker Adaptation using Regression-

In this paper, regression-tree based spectral peak alignment is proposed for rapid speaker adaptation using the linearization of VTLN. Two different regression classes are investigated: phonetic classes (using combined knowledge and data-driven techniques) and mixture classes. Compared to MLLR and VTLN, improved performance can be obtained for both supervised and unsupervised adaptations on bot...


Speaker adaptation based on nonlinear spectral transform for speech recognition

This paper proposes a speaker adaptation technique using a nonlinear spectral transform based on GMMs. One of the most popular forms of speaker adaptation is based on linear transforms, e.g., MLLR. Although MLLR uses multiple transforms according to regression classes, only a single linear transform is applied to each state. The proposed method performs nonlinear speaker adaptation based on a n...


Speaker adaptation of an acoustic-to-articulatory inversion model using cascaded Gaussian mixture regressions

The article presents a method for adapting a GMM-based acoustic-articulatory inversion model trained on a reference speaker to another speaker. The goal is to estimate the articulatory trajectories in the geometrical space of a reference speaker from the speech audio signal of another speaker. This method is developed in the context of a system of visual biofeedback, aimed at pronunciation trai...


Tree-based estimation of speaker characteristics for speech recognition

Speaker adaptation by means of adjustment of speaker characteristic properties, such as vocal tract length, has the important advantage compared to conventional adaptation techniques that the adapted models are guaranteed to be realistic if the description of the properties is. One problem with this approach is that the search procedure to estimate them is computationally heavy. We address the...


An improved minimum generation error based model adaptation for HMM-based speech synthesis

A minimum generation error (MGE) criterion had been proposed for model training in HMM-based speech synthesis. In this paper, we apply the MGE criterion to model adaptation for HMM-based speech synthesis, and introduce an MGE linear regression (MGELR) based model adaptation algorithm, where the regression matrices used to transform source models are optimized so as to minimize the generation err...




Publication date: 2006